Search CORE

84 research outputs found

Parameterized Object Sensitivity for Points-to Analysis for Java

Author: Ana Milanova
Atanas Rountev
Barbara G. Ryder
Publication venue
Publication date: 01/01/2002
Field of study

The goal of points-to analysis for Java is to determine the set of objects pointed to by a reference variable or a reference object field. We present object sensitivity, a new form of context sensitivity for flow-insensitive points-to analysis for Java. The key idea of our approach is to analyze a method separately for each of the object names that represent runtime objects on which this method may be invoked. To ensure flexibility and practicality, we propose a parameterization framework that allows analysis designers to control the tradeo#s between cost and precision in the object-sensitive analysis

CiteSeerX

Differential Privacy for Coverage Analysis of Software Traces (Artifact)

Author: Bassily Raef
Hao Yu
Latif Sufian
Rountev Atanas
Zhang Hailong
Publication venue: DARTS - Dagstuhl Artifacts Series. DARTS, Volume 7, Issue 2, Special Issue of the 35th European Conference on Object-Oriented Programming (ECOOP 2021)
Publication date: 01/01/2021
Field of study

We propose a differentially private coverage analysis for software traces. To demonstrate that it achieves low error and high precision while preserving privacy, we evaluate the analysis on simulated traces for 15 Android apps. The open source implementation of the analysis, which is in Java, and the dataset used in the experiments are released as an artifact. We also provide specific guidance on reproducing the experimental results

Dagstuhl Research Online Publication Server

Beyond Reuse Distance Analysis: Dynamic Analysis for Characterization of Data Locality Potential

Author: Elango Venmugil
Fauzia Naznin
Pouchet Louis-Noël
Ramanujam J.
Rastello Fabrice
Ravishankar Mahesh
Rountev Atanas
Sadayappan P.
Publication venue
Publication date: 01/12/2013
Field of study

Emerging computer architectures will feature drastically decreased flops/byte (ratio of peak processing rate to memory bandwidth) as highlighted by recent studies on Exascale architectural trends. Further, flops are getting cheaper while the energy cost of data movement is increasingly dominant. The understanding and characterization of data locality properties of computations is critical in order to guide efforts to enhance data locality. Reuse distance analysis of memory address traces is a valuable tool to perform data locality characterization of programs. A single reuse distance analysis can be used to estimate the number of cache misses in a fully associative LRU cache of any size, thereby providing estimates on the minimum bandwidth requirements at different levels of the memory hierarchy to avoid being bandwidth bound. However, such an analysis only holds for the particular execution order that produced the trace. It cannot estimate potential improvement in data locality through dependence preserving transformations that change the execution schedule of the operations in the computation. In this article, we develop a novel dynamic analysis approach to characterize the inherent locality properties of a computation and thereby assess the potential for data locality enhancement via dependence preserving transformations. The execution trace of a code is analyzed to extract a computational directed acyclic graph (CDAG) of the data dependences. The CDAG is then partitioned into convex subsets, and the convex partitioning is used to reorder the operations in the execution trace to enhance data locality. The approach enables us to go beyond reuse distance analysis of a single specific order of execution of the operations of a computation in characterization of its data locality properties. It can serve a valuable role in identifying promising code regions for manual transformation, as well as assessing the effectiveness of compiler transformations for data locality enhancement. We demonstrate the effectiveness of the approach using a number of benchmarks, including case studies where the potential shown by the analysis is exploited to achieve lower data movement costs and better performance.Comment: Transaction on Architecture and Code Optimization (2014

arXiv.org e-Print Archive

HAL-ENS-LYON

INRIA a CCSD electronic archive server

Hal-Diderot

Register Optimizations for Stencils on GPUs

Author: Pouchet Louis-Noël
Rastello Fabrice
Rountev Atanas
Sadayappan Ponnuswamy
Singh Prashant
Sukumaran-Rajam Aravind
Publication venue: HAL CCSD
Publication date: 24/02/2018
Field of study

International audienceThe recent advent of compute-intensive GPU architecture has allowed application developers to explore high-order 3D stencils for better computational accuracy. A common optimization strategy for such stencils is to expose sufficient data reuse by means such as loop unrolling, with the expectation of register-level reuse. However, the resulting code is often highly constrained by register pressure. While current state-of-the-art register allocators are satisfactory for most applications, they are unable to effectively manage register pressure for such complex high-order stencils, resulting in sub-optimal code with a large number of register spills. In this paper, we develop a statement reordering framework that models stencil computations as a DAG of trees with shared leaves, and adapts an optimal scheduling algorithm for minimizing register usage for expression trees. The effectiveness of the approach is demonstrated through experimental results on a range of stencils extracted from application codes

INRIA a CCSD electronic archive server

Associative Instruction Reordering to Alleviate Register Pressure

Author: Pouchet Louis-Noël
Rastello Fabrice
Rountev Atanas
Sadayappan Ponnuswamy
Singh Rawat Prashant
Sukumaran-Rajam Aravind
Publication venue: HAL CCSD
Publication date: 11/11/2018
Field of study

International audienceRegister allocation is generally considered a practically solved problem. For most applications, the register allocation strategies in production compilers are very effective in controlling the number of loads/stores and register spills. However, existing register allocation strategies are not effective and result in excessive register spilling for computation patterns with a high degree of many-to-many data reuse, e.g., high-order stencils and tensor contractions. We develop a source-to-source instruction reordering strategy that exploits the flexibility of reordering associative operations to alleviate register pressure. The developed transformation module implements an adaptable strategy that can appropriately control the degree of instruction-level parallelism, while relieving register pressure. The effectiveness of the approach is demonstrated through experimental results using multiple production compilers (GCC, Clang/LLVM) and target platforms (Intel Xeon Phi, and Intel x86 multi-core)

INRIA a CCSD electronic archive server

Beyond Reuse Distance Analysis: Dynamic Analysis for Characterization of Data Locality Potential

Author: Elango Venmugil
Fauzia Naznin
Pouchet Louis-Noël
Ramanujam Jagannathan
Rastello Fabrice
Ravishankar Mahesh
Rountev Atanas
Sadayappan Ponnuswamy
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/12/2013
Field of study

International audienceEmerging computer architectures will feature drastically decreased flops/byte (ratio of peak processing rate to memory bandwidth) as highlighted by recent studies on Exascale architectural trends. Further, flops are getting cheaper while the energy cost of data movement is increasingly dominant. The understanding and characterization of data locality properties of computations is critical in order to guide efforts to enhance data locality. Reuse distance analysis of memory address traces is a valuable tool to perform data locality characterization of programs. A single reuse distance analysis can be used to estimate the number of cache misses in a fully associative LRU cache of any size, thereby providing estimates on the minimum bandwidth requirements at different levels of the memory hierarchy to avoid being bandwidth bound. However, such an analysis only holds for the particular execution order that produced the trace. It cannot estimate potential improvement in data locality through dependence preserving transformations that change the execution schedule of the operations in the computation. In this article, we develop a novel dynamic analysis approach to characterize the inherent locality properties of a computation and thereby assess the potential for data locality enhancement via dependence preserving transformations. The execution trace of a code is analyzed to extract a computational directed acyclic graph (CDAG) of the data dependences. The CDAG is then partitioned into convex subsets, and the convex partitioning is used to reorder the operations in the execution trace to enhance data locality. The approach enables us to go beyond reuse distance analysis of a single specific order of execution of the operations of a computation in characterization of its data locality properties. It can serve a valuable role in identifying promising code regions for manual transformation, as well as assessing the effectiveness of compiler transformations for data locality enhancement. We demonstrate the effectiveness of the approach using a number of benchmarks, including case studies where the potential shown by the analysis is exploited to achieve lower data movement costs and better performance

INRIA a CCSD electronic archive server

Static control-flow analysis for reverse engineering of UML sequence diagrams

Author: Atanas Rountev
Publication venue: ACM Press
Publication date: 01/01/2005
Field of study

UML sequence diagrams are commonly used to represent the interactions among collaborating objects. Reverse-engineered sequence diagrams are constructed from existing code, and have a variety of uses in software development, maintenance, and testing. In static analysis for such reverse engineering, an open question is how to represent the intraprocedural flow of control from the code using the control-flow primitives of UML 2.0. We propose simple UML extensions that are necessary to capture general flow of control. The paper describes an algorithm for mapping a reducible exceptionfree intraprocedural control-flow graph to UML, using the proposed extensions. We also investigate the inherent tradeoffs of different problem solutions, and discuss their implications for reverse-engineering tools. This work is a substantial step towards providing high-quality tool support for effective and efficient reverse engineering of UML sequence diagrams. 1

CiteSeerX